hive:数据库“行专列”操作 |
您所在的位置:网站首页 › hive string_agg › hive:数据库“行专列”操作 |
hive:数据库“行专列”操作---使用collect_set/collect_list/collect_all & row_number()over(partition by 分组字段 [order by 排序字段])
转载
mob60475707384d 2018-01-13 20:55:00 文章标签 Hadoop+Spark hive 字段 调优 mysql 文章分类 代码人生 方案一:请参考《数据库“行专列”操作---使用row_number()over(partition by 分组字段 [order by 排序字段])》,该方案是sqlserver,oracle,mysql,hive均适用的。 在hive中的方案分为以下两种方案: 创建测试表,并插入测试数据: --hive 测试 行转列 collect_set collect_list create table tommyduan_test( gridid string, height int, cell string, mrcount int, weakmrcount int ); insert into tommyduan_test values('g1',1,'cell1',12,3); insert into tommyduan_test values('g1',1,'cell2',22,3); insert into tommyduan_test values('g1',1,'cell3',23,3); insert into tommyduan_test values('g1',1,'cell4',1,3); insert into tommyduan_test values('g1',1,'cell5',3,3); insert into tommyduan_test values('g1',1,'cell6',4,3); insert into tommyduan_test values('g1',1,'cell19',21,3); insert into tommyduan_test values('g2',1,'cell4',1,3); insert into tommyduan_test values('g2',1,'cell5',3,3); insert into tommyduan_test values('g2',1,'cell6',4,3); insert into tommyduan_test values('g2',1,'cell19',21,3); 方案二:使用collect_set方案注意:collect_set是一个set集合,不允许重复的记录插入 select gridid,height,collect_list(cell) cellArray,collect_list(mrcount) mrcountArray,collect_list(weakmrcount) weakmrcountArray from ( select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn from tommyduan_test group by gridid,height,cell,mrcount,weakmrcount ) t10 where rn0 then cellArray[0] else '-9999' end) as cell1, (case when size(cellArray)>0 then mrcountArray[0] else '-9999' end) as cell1_mrcount, (case when size(cellArray)>0 then weakmrcountArray[0] else '-9999' end) as cell1_weakmrcount, (case when size(cellArray)>1 then cellArray[1] else '-9999' end) as cell2, (case when size(cellArray)>1 then mrcountArray[1] else '-9999' end) as cell2_mrcount, (case when size(cellArray)>1 then weakmrcountArray[1] else '-9999' end) as cell2_weakmrcount, (case when size(cellArray)>2 then cellArray[2] else '-9999' end) as cell3, (case when size(cellArray)>2 then mrcountArray[2] else '-9999' end) as cell3_mrcount, (case when size(cellArray)>2 then weakmrcountArray[2] else '-9999' end) as cell3_weakmrcount from ( select gridid,height,collect_list(cell) cellArray,collect_list(mrcount) mrcountArray,collect_list(weakmrcount) weakmrcountArray from ( select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn from tommyduan_test group by gridid,height,cell,mrcount,weakmrcount ) t10 where rn0 then split(cellArray[0],'_')[1] else '-9999' end) as cell1_mrcount, (case when size(cellArray)>0 then split(cellArray[0],'_')[2] else '-9999' end) as cell1_weakmrcount, (case when size(cellArray)>1 then split(cellArray[1],'_')[0] else '-9999' end) as cell2, (case when size(cellArray)>1 then split(cellArray[1],'_')[1] else '-9999' end) as cell2_mrcount, (case when size(cellArray)>1 then split(cellArray[1],'_')[2] else '-9999' end) as cell2_weakmrcount, (case when size(cellArray)>2 then split(cellArray[2],'_')[0] else '-9999' end) as cell3, (case when size(cellArray)>2 then split(cellArray[2],'_')[1] else '-9999' end) as cell3_mrcount, (case when size(cellArray)>2 then split(cellArray[2],'_')[2] else '-9999' end) as cell3_weakmrcount from ( select gridid,height,collect_set(concat_ws('_',cell,cast(mrcount as string), cast(weakmrcount as string))) as cellArray from ( select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn from tommyduan_test group by gridid,height,cell,mrcount,weakmrcount ) t10 where rn 本文章为转载内容,我们尊重原作者对文章享有的著作权。如有内容错误或侵权问题,欢迎原作者联系我们进行内容更正或删除文章。 赞 收藏 评论 分享 举报上一篇:Hive:有表A与表B进行inner join,如果A分组内包含有数据,使用A,否则使用B分组下的数据 Hive&SqlServerql:inner join on条件中如果两边都是空值的情况下,关联结果中会把数据给过滤掉 下一篇:数据库“行专列”操作---使用row_number()over(partition by 分组字段 [order by 排序字段]) |
CopyRight 2018-2019 办公设备维修网 版权所有 豫ICP备15022753号-3 |